A Primer on Generative Adversarial Networks by Sanaa Kaddoura
Data Collection and Preparation
Various GAN models can be used in deep fake video generation, such as speech-to-video or video-to-video GANs. Speech-to-video GANs generate videos of talking faces from an audio file and images of the target. Video-to-video GANs, on the other hand, generate counterfeit videos of a target individual and require footage of both a source and a target person; the GAN then swaps faces and voices in a video. This section focuses on video-to-video GANs.
To create a deep fake video, a dataset of videos of a real person is fed into the GAN. One option is to collect a large and diverse dataset of real videos as the basis for generating fake videos. The dataset should have sufficient variability in viewpoints, lighting conditions, backgrounds, and other relevant factors, and it should be annotated to facilitate training and evaluation of the model. It must contain different videos of both the source and the target speaker: the vocals and image of target speaker B will replace the vocals and image of source speaker A. Another option is to use public datasets. Various video datasets are available for this purpose, such as the FaceForensics++ dataset [5], which contains real and deep fake videos of multiple individuals. The VoxCeleb dataset [6] is another viable option. It includes over 1,000 hours of audio and video recordings of individuals, making it suitable for training deep fakes that involve both audio and video. Among the many datasets available online, VoxCeleb fits this problem well: it consists of short clips of human speech extracted from interview videos uploaded to YouTube.
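As a concrete illustration, the sketch below shows one way to organize a VoxCeleb-style collection of clips into source/target training pairs. It is a minimal example, not code from any dataset toolkit: the directory layout and the pair_clips helper are assumptions made for illustration.

```python
from pathlib import Path
from itertools import product

# Assumed (hypothetical) layout: one folder of clips per speaker, e.g.
#   data/voxceleb/source_speaker_A/clip_001.mp4
#   data/voxceleb/target_speaker_B/clip_001.mp4
DATA_ROOT = Path("data/voxceleb")

def list_clips(speaker_dir: Path) -> list[Path]:
    """Collect every .mp4 clip for one speaker, sorted for reproducibility."""
    return sorted(speaker_dir.glob("*.mp4"))

def pair_clips(source_id: str, target_id: str) -> list[tuple[Path, Path]]:
    """Build (source, target) clip pairs for a video-to-video GAN.

    Each source clip is paired with each target clip so the model sees
    speaker A's motion and speech alongside speaker B's identity.
    """
    source_clips = list_clips(DATA_ROOT / source_id)
    target_clips = list_clips(DATA_ROOT / target_id)
    return list(product(source_clips, target_clips))

if __name__ == "__main__":
    pairs = pair_clips("source_speaker_A", "target_speaker_B")
    print(f"{len(pairs)} source/target clip pairs found")
```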
Once the dataset is collected, it must be preprocessed and formatted for training the GAN model: resizing the videos to a consistent resolution, normalizing the pixel values, cropping out unwanted parts of the frame such as black borders, and splitting the videos into individual frames. Splitting into frames is essential because working with entire videos is computationally intensive and time-consuming. The frames can then be augmented to increase the variability in the training data by applying random rotations, zooms, and flips.
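The following sketch shows one possible implementation of these steps using OpenCV and NumPy; it is an illustrative assumption, not the book's code. Frames are extracted, center-cropped to a square, resized to 256×256, scaled to [-1, 1], and randomly flipped, rotated, and zoomed for augmentation. The 256×256 resolution and the augmentation ranges are arbitrary choices.

```python
import random
import cv2
import numpy as np

FRAME_SIZE = 256  # assumed working resolution; many GANs train at 256x256

def extract_frames(video_path: str) -> list[np.ndarray]:
    """Split a video into its individual frames."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    while ok:
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()
    return frames

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Center-crop to a square, resize, and normalize pixels to [-1, 1]."""
    h, w = frame.shape[:2]
    side = min(h, w)  # crop away borders / unwanted edges
    top, left = (h - side) // 2, (w - side) // 2
    frame = frame[top:top + side, left:left + side]
    frame = cv2.resize(frame, (FRAME_SIZE, FRAME_SIZE))
    return frame.astype(np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]

def augment(frame: np.ndarray) -> np.ndarray:
    """Random horizontal flip, small rotation, and slight zoom."""
    if random.random() < 0.5:
        frame = cv2.flip(frame, 1)       # horizontal flip
    angle = random.uniform(-10, 10)      # small random rotation (degrees)
    zoom = random.uniform(1.0, 1.1)      # slight random zoom-in
    center = (FRAME_SIZE / 2, FRAME_SIZE / 2)
    m = cv2.getRotationMatrix2D(center, angle, zoom)
    return cv2.warpAffine(frame, m, (FRAME_SIZE, FRAME_SIZE))

if __name__ == "__main__":
    frames = [augment(preprocess(f)) for f in extract_frames("clip_001.mp4")]
    print(f"prepared {len(frames)} frames of shape {frames[0].shape}")
```

In practice the augmentation step would be applied on the fly during training rather than ahead of time, so each epoch sees slightly different variants of the same frames.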